The MPI + CUDA Gaia AVU–GSR Parallel Solver Toward Next-generation Exascale Infrastructures
Abstract
We ported to the GPU with CUDA the Astrometric Verification Unit-Global Sphere Reconstruction (AVU-GSR) Parallel Solver developed for the ESA Gaia mission, by optimizing a previous OpenACC port of this application. The code aims to find, with a [10,100] $\mu$as precision, the astrometric parameters of $\sim 10^8$ stars, the attitude and instrumental settings of the Gaia satellite, and the global parameter $\gamma$ of the parametrized Post-Newtonian formalism, by solving a system of linear equations, $A\times x=b$, with the LSQR iterative algorithm. The coefficient matrix $A$ of the final Gaia dataset is large, with $\sim 10^{11} \times 10^8$ elements, and sparse, reaching a size of $\sim$10-100 TB, typical of Big Data analysis, which requires an efficient parallelization to obtain scientific results in reasonable timescales. The speedup of the CUDA code over the original AVU-GSR solver, parallelized on the CPU with MPI+OpenMP, increases with the number of resources, reaching a maximum of $\sim$14x, and >9x over the OpenACC application. This result was obtained by comparing the two codes on the CINECA cluster Marconi100, with 4 V100 GPUs per node. After verifying the agreement between the solutions of a set of systems with different sizes computed with the CUDA and the OpenMP codes, and that the solutions showed the required precision, the CUDA code was put in production on Marconi100, which is essential for an optimal AVU-GSR pipeline and the successive Gaia Data Releases. This analysis represents a first step to understand the (pre-)Exascale behavior of a class of applications that follow the same structure as this code. In the next months, we plan to run this code on the pre-Exascale platform Leonardo of CINECA, with 4 next-generation A200 GPUs per node, toward a port to this infrastructure, where we expect to obtain even higher performance.
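To make the numerical core of the abstract concrete, the following is a minimal, serial sketch of the LSQR iteration (Golub-Kahan bidiagonalization plus Givens rotations) applied to a tiny sparse system $A\times x=b$. It is not the AVU-GSR code: the function names (`matvec`, `rmatvec`, `lsqr`), the row-wise sparse storage, and the 4 x 3 example system are all assumptions made for illustration. The production solver distributes these same two matrix-vector products across MPI ranks and GPUs.

```python
import math

def matvec(rows, x):
    """Return A @ x, with A stored as a list of sparse rows [(col, value), ...]."""
    return [sum(val * x[col] for col, val in row) for row in rows]

def rmatvec(rows, y, n):
    """Return A^T @ y for the same row-wise sparse storage (n = number of columns)."""
    z = [0.0] * n
    for yi, row in zip(y, rows):
        for col, val in row:
            z[col] += val * yi
    return z

def norm(v):
    return math.sqrt(sum(t * t for t in v))

def lsqr(rows, b, n, tol=1e-12, max_iter=100):
    """Hypothetical minimal LSQR: approximately minimizes ||A x - b||_2."""
    x = [0.0] * n
    beta = norm(b)
    if beta == 0.0:
        return x
    u = [t / beta for t in b]
    v = rmatvec(rows, u, n)
    alpha = norm(v)
    if alpha == 0.0:
        return x
    v = [t / alpha for t in v]
    w = v[:]
    phibar, rhobar = beta, alpha
    for _ in range(max_iter):
        # Bidiagonalization: next left vector u and right vector v.
        u = [a - alpha * ui for a, ui in zip(matvec(rows, v), u)]
        beta = norm(u)
        if beta > 0.0:
            u = [t / beta for t in u]
        v_new = [a - beta * vi for a, vi in zip(rmatvec(rows, u, n), v)]
        alpha = norm(v_new)
        if alpha > 0.0:
            v_new = [t / alpha for t in v_new]
        # Givens rotation that eliminates the subdiagonal entry beta.
        rho = math.hypot(rhobar, beta)
        c, s = rhobar / rho, beta / rho
        theta = s * alpha
        rhobar = -c * alpha
        phi = c * phibar
        phibar = s * phibar
        # Update the solution and the search direction.
        x = [xi + (phi / rho) * wi for xi, wi in zip(x, w)]
        w = [vi - (theta / rho) * wi for vi, wi in zip(v_new, w)]
        v = v_new
        # phibar * alpha * |c| equals ||A^T r|| for the current residual r.
        if phibar * alpha * abs(c) <= tol:
            break
    return x

# Small consistent example: 4 observations, 3 unknowns (illustrative values).
rows = [
    [(0, 2.0), (2, 1.0)],
    [(1, 3.0)],
    [(0, 1.0), (1, 1.0)],
    [(1, 1.0), (2, 4.0)],
]
x_true = [1.0, -2.0, 3.0]
b = matvec(rows, x_true)
x = lsqr(rows, b, n=3)
```

Each iteration costs one product with $A$ and one with $A^T$, which is why these two kernels are the natural targets for GPU offloading in a solver of this kind.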
Similar resources
Toward Parallel CFA with Datalog, MPI, and CUDA
We present our recent experience working to design parallel functional control-flow analysis (CFA) using an encoding in Datalog and underlying relational algebra implemented for SIMD coprocessors and supercomputers. Control-flow analysis statically models the possible propagations of data and control through a target program, finitely obtaining a bound on reachable expressions and environments ...
NVIDA CUDA Architecture-Based Parallel SAT Solver
The SAT problem is the first NP-complete problem. So far there is no algorithm that can solve it in polynomial time. Over the past decade, the development of efficient and scalable algorithms has dramatically leveraged the ability of solving SAT problem instances involving tens of thousands of variables and millions of constraints. But as industry demand is increasing, a faster SAT solver is ne...
MPI at Exascale
With petascale systems already available, researchers are devoting their attention to the issues needed to reach the next major level in performance, namely, exascale. Explicit message passing using the Message Passing Interface (MPI) is the most commonly used model for programming petascale systems today. In this paper, we investigate what is needed to enable MPI to scale to exascale, both in ...
Towards next generation coordination infrastructures
Coordination infrastructures play a central role in the engineering of multiagent systems. Since the advent of agent technology, research on coordination infrastructures has produced a significant number of infrastructures with varying features. In this paper, we review the state-of-the-art coordination infrastructures with the purpose of identifying open research challenges that next gener...
Towards Exascale Parallel Delaunay Mesh Generation
Mesh generation is a critical component for many (bio-)engineering applications. However, parallel mesh generation codes, which are essential for these applications to take the fullest advantage of the high-end computing platforms, belong to the broader class of adaptive and irregular problems, and are among the most complex, challenging, and labor intensive to develop and maintain. As a result...
Journal
Journal title: Publications of the Astronomical Society of the Pacific
Year: 2023
ISSN: ['0004-6280', '1538-3873']
DOI: https://doi.org/10.1088/1538-3873/acdf1e